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Abstract 

Congenital heart disease (CHD) is the nnost common form of congenital human birth anomalies and a leading cause of 
perinatal and infant mortality. Some studies including our published genome-wide association study (GWAS) of CHD have 
indicated that genetic variants may contribute to the risk of CHD. Recently, Cordell et al. published a GWAS of multiple CHD 
phenotypes in European Caucasians and identified 3 susceptibility loci (rs870142, rsl6835979 and rs6824295) for ostium 
secundum atrial septal defect (ASD) at chromosome 4pl 6. However, whether these loci at 4pl 6 confer the predisposition to 
CHD in Chinese population is unclear. In the current study, we first analyzed the associations between these 3 single 
nucleotide polymorphisms (SNPs) at 4pl6 and CHD risk by using our existing genome-wide scan data and found all of the 3 
SNPs showed significant associations with ASD in the same direction as that observed in Cordell's study, but not with other 
subtypes- ventricular septal defect (VSD) and ASD combined VSD. As these 3 SNPs were in high linkage disequilibrium (LD) 
in Chinese population, we selected one SNP with the lowest P value in our GWAS scan (rsl 6835979) to perform a replication 
study with additional 1,709 CHD cases with multiple phenotypes and 1,962 controls. The significant association was also 
observed only within the ASD subgroup, which was heterogeneous from other disease groups. In combined GWAS and 
replication samples, the minor allele of rsl 6835979 remained significant association with the risk of ASD (0R= 1.22, 95% 
CI = 1.08-1.38, P = 0.001). Our findings suggest that susceptibility loci of ASD identified from Cordell's European GWAS are 
generalizable to Chinese population, and such investigation may provide new insights into the roles of genetic variants in 
the etiology of different CHD phenotypes. 
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Introduction 

Congenital heart disease (CHD) usually refers to abnormalities 
in the heart's structure or function that arise before birth [1]. As 
the most frequent birth defect, CHD affects about 0.8% of hve 
births [2,3], and it is also a leading cause of perinatal and infant 
mortality, causing more than 220,000 deaths globally every year 
[4]. In China, epidemiological studies suggest an obvious 
increasing trend of CHD mortality with the overall mortality rate 
increasing from 141 in 2003 to 229 in 2010 per 10,000,000 
person-years [5]. CHD can be classified into three broad 
categories: cyanotic heart disease, left-sided obstruction defects 
and septation defects [1]. Among them, septation defects are most 



common and can affect septation of the atria (atrial septal defect, 
ASD), septation of the ventricle (ventricular septal defect, VSD) or 
formation of structures in the central part of the heart 
(atrioventricular septal defect, AVSD) [1]. CHD may occur as 
part of recognized chromosomal and Mendelian syndromes [6,7]; 
however, it manifests as a non-syndromic, non-Mendelian 
condition in 80% of cases and may result from a multifactorial 
inheritance model that involves environmental factors and genetic 
predisposition [8] . In the past years, several genes linked to CHD 
have been identified, including T-box transcription factors [TBXl , 
TBX5 and TBX20), homobox transcription factors [NKX2.5 and 
NKX2.6), basic helix-loop-helix transcription factors [HANDl 
and HAND2), and GATA binding protein 4 {GATA4) [9]. 
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More recently, genome-wide association studies (GWAS) have 
emerged as a powerful tool to identify novel susceptibility loci or 
genes of disease [10]. In May 2013, our research group conducted 
a GWA scan with 945 patients of ASD, VSD and ASD combined 
VSD cases and 1,246 non-CHD controls, followed by a two-stage 
vahdation with 2,160 ASD, VSD and ASD combined VSD cases 
and 3,866 controls [11]. We identified 2 loci at lpl2 (rs2474937 
near TBX15 gene, odds ratio (OR) = 1.40, P = 8.44x10"'°) and 
4q31.1 (rsl531070 inMAMLB gene, OR= 1.40, P = 4.99x 10"'^) 
associated with CHD significandy (P<5.0 x lO"**). However, these 
variants may represent only a very small proportion of SNPs 
contributing to CHD risk because of the strict criteria of the 
GWAS significance level (P= 10 ^ or P= 10 % Simultaneously, 
a GWAS of multiple CHD phenotypes in European Caucasians 
was conducted by Cordell el al. In that GWAS, when all CHD 
phenotypes were considered together, no region achieved genome- 
wide significant association. However, 3 SNPs were identified to 
be associated with ASD risk significandy after the stratification 
analysis was performed in dilferent diagnostic groups (rs870142, 
^'combimxi = 2.61xl0"'", odds ratio (OR)= 1.456; rsl6835979, 
/'combined = 2.94x10"'", OR = 1.452 and rs6824295, P,„,„_ 
bi„ed = 9-73x10"'", OR= 1.437) [12]. However, it is uncertain 
whether these associations could be applicable to diverse 
populations. Therefore, we first investigated the association 
between the 3 SNPs (rs870142, rsl6835979 and rs6824295) and 
CIID risk by using our existing genome-wide scan data and then 
performed a replication study with additional 1,709 CHD cases 
with multiple phenotypes and 1,962 controls. 

Materials and Methods 

Ethics Statement 

This study was approved by the institutional review board of 
Nanjing Medical University. The design and performance of 
current study involving human subjects were clearly described in a 
research protocol. AU participants over the age of 18 would 
complete the informed consent in writing before taking part in this 
research. For the participants under the age of 18, parental written 
consent was obtained. 

Study populations 

Study population of GWAS scan has been described previously 
[11]. For the replication stage, 1,709 CHD cases and 1,962 
controls were recruited from the First AffiUated Hospital of 
Nanjing Medical University and the Affiliated Nanjing Children's 
Hospital of Nanjing Medical University, Nanjing, China, between 
March 2009 and May 2013. CHD were diagnosed based on 
echocardiography, some of which were further confirmed by 
cardiac catheterization and/or surgery. Cases that had clinical 
features of developmental syndromes, multiple major develop- 
mental anomalies or known chromosomal abnormalities were 
excluded. Exclusion criteria also included a positive family history 
of CHD in a first-degree relative (parents, siblings and children), 
maternal diabetes meUitus, phenylketonuria, maternal teratogen 
exposures (e.g., pesticides and organic solvents), and maternal 
therapeutic drugs exposures during the intrauterine period. 
Controls were non-CHD outpatients from the same geographic 
areas. They were recruited from the hospitals listed above during 
the same time period. Controls with congenital anomalies or 
cardiac disease were excluded. For each participant, approxi- 
mately 2-ml whole blood was obtained to extract genomic DNA 
for genotyping analysis. 



Quality control for GWAS scan and genotyping of 
replication study 

The GWA scan was conducted using lUumina Omni zhonghua 
chips followed liy a systematic quality control before association 
analysis [11]. In brief, we removed samples with low call rates, 
ambiguous gender, familial relationships, and extreme heterozy- 
gosity rate. We also detected population outiiers and stratification 
using a PCA-based method. Our PCA analysis showed that the 
cases and controls were genetically matched, and the genomic 
control inflation factor (k) was 1.065 [11]. SNPs were excluded if 
they are not mapped on autosomal chromosomes, or have a call 
rate <95%, minor allele frequency (MAF)<0.05 or P< 
1.00x10"^ for Hardy-Weinberg equilibrium. After quality control 
procedures, a total of 708,275 SNPs in 945 CHD cases and 1,246 
controls were included in the final GWA analysis [11]. The 
genotyping analyses in the rephcation study were performed by 
using the TaqMan allelic discrimination Assay (Applied Biosys- 
tems, Inc, USA). The information of the primers and probes were 
provided in Table 1. A series of methods were used to control the 
quality of genotyping: (i) case and control samples were mixed on 
each plate; (ii) genotyping was performed without knowing the 
case or control status; (iii) two water controls were used in each 
plate as blank control; (iv) five percent of the samples were 
randomly selected for repeat genotyping. The accurate concor- 
dance rate of genotyping for tiie SNP rsl6835979 was 100.00% in 
the replication stage. 

Statistical analysis 

Population structure was evaluated by PCA as implemented in 
the software package EIGENSTRAT 3.1. The significant (P< 
0.05) top eigenvector was included in the logistic regression model 
as a covariate for genetic association analysis [1 1] . Hardy- 
Weinberg equilibrium was evaluated using the %2 test in all of 
the controls. The genome-wide association analysis was performed 
using an additive model in a logistic regression analysis in PLINK 
1.07 (http://pngu.mgh.harvard.edu/~purcell/plink/). The .statis- 
tical significance was di;clared at a = 0.05. P values were two- 
sided, and ORs were estimated with an additive model by logistic 
regression analyses. The Unkage disequilibrium (LD) analysis of the 
3 SNPs (rs870142, rsl6835979 and rs6824295) was performed in 
the controls of GWAS by using PLINK 1 .07. Other analyses were 
performed using Stata version 9.2 (StataCorp LP, TX). The Chi- 
squared based Cochran's Q_ statistic was calculated to test for 
heterogeneity between groups in stratified analysis. We used 
inverse-variance-weighted model for meta-analysis when perform- 
ing combined analysis. Random-effects model was used when 
heterogeneity between studies exists, that is, P value for 
heterogeneity test was less than 0.05; otherwise, fixed-effect model 
was used. 

Results 

The characteristics of the 945 CHD cases and the 1,246 non- 
CHD controls in GWA scan have been described previously [1 1]. 
Moreover, selected characteristics of the 1,709 cases and 1,962 
non-CHD controls in the replication study were shown in 
Table 2 and there were similar distributions of age and sex 
between cases and controls (P = 0.221 and 0.870, re.spectively). 

The genotype distributions of these 3 SNPs in cases and controls 
of GWA scan were shown in Table 3. The observed genotype 
frequencies were all in Hardy-Weinberg equilibrium in controls 
(P = 0.337 for r,s870142, P = 0.370 for rsl6835979 and P = 0.385 
for rs6824295). Logistic regression analyses showed that no 
obvious associations were found between the 3 SNPs and overall 
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Table 1. Primers and probes of rsl 6835979. 



rsl 6835979 

F-primer TGTGGACTCTAGAATGGACTTCCA 

R-primer TGTCAGATTAGGACCATCTCCATCT 

FAM-probe FAM-AGTGACTTTACTGTCCTC-MGB 

HEX-probe HEX-AAGTGACTTTAATGTCC-MGB 



doi:l 0.1 371 /journal.pone.Ol 0741 1 .tOOl 

risk of CHD in an additive model (Table 3). Then, we performed 
stratification analysis by different diagnostic groups (ASD, VSD 
and ASD combined VSD) and found tliat all 3 SNPs showed 
significant associations with ASD risk (N = 334 cases) in the same 
direction as that observed in Cordell's study (rs870142: OR = 1 .24, 
95% CI= 1.04-1.49, P = 0.016; rsl6835979: OR= 1.25, 95% 
CI= 1.05-1.50, P = 0.014; rs6824295: OR= 1.24, 95% 
CI= 1.04-1.48, P = 0.019). However, no significant association 
was observed in the other two disease groups (Table 3). 

As these 3 SNPs were in high LD (r^ = 0.86-0.99) in Chinese 
population, we selected one SNP with the lowest P value in our 
GWAS scan (rsl6835979) to perform a replication study with 
additional 1,709 CHD cases of multiple phenotypes (ASD: 
N = 367; VSD: N = 432, Tetralogy of Fallot (TOF): N = 211, 
other CHDs: N = 699) and 1,962 controls. Similarly, no significant 
association was observed when all phenotypes were combined 
together. In the stratification analysis by disease groups, 
rsl6835979 only showed significant association with the risk of 
ASD (OR= 1.20, 95% CI= 1.02-1.42, P = 0.030), which was 
heterogeneous from the other subgroups significandy [P for 
heterogeneity — 0.046) (Table 4). In combined GWAS and 
replication samples, the minor allele of rsl6835979 remained 
significandy associated with ASD risk (OR= 1.22, 95%CI= 1.08- 
1.38, P = 0.001) (Table 5). 

Discussion 

In this study, we attempted to replicate the 4pl6 ASD 
susceptibility loci in Han Chinese population and found that our 
results were consistent with the results presented by CordeU el al to 
a certain extent. These findings suggest that the susceptibility loci 



of diseases may have some similarities among different races and 
ethnicities. 

In Cordell's study, 3 SNPs (rs870142, rsl6835979 and 
rs6824295) were identified to be significantly associated with 
ASD risk [12]. We performed a LD analysis among the 3 SNPs by 
the online software of SNAP (http://www.broadinstitute.org/ 
mpg/siiap/ldsearchpw.php) and found strong LD between them 
(r^ = 0.89 to 0.96) in CEU (CEPH- Utah residents with European 
ancestry). After evaluating the LD between the 3 SNPs in the 
controls of our GWAS samples, we found that they were also in 
high LD in Chinese populations (f^ = 0.86-0.99). There were 28 
other SNPs in the LD plot of the region containing the 3 
mentioned SNPs (chr4:4665 18 1-4698948) and we found that most 
SNPs were in high LD with the 3 SNPs in European population, 
while only about half of them were in high LD with the 3 SNPs in 
Chinese population. Although we included 701 ASD cases and 
3,208 controls in this study (334 cases and 1,246 controls in our 
GWA scan and 367 ASD cases and 1,962 controls in the 
replication study), the sample size, especially the case sample size 
was relatively smaller compared to the original Cordell's GWAS. 
After performing multiple testing corrections with the Bonferroni 
single-step method (corrected a: 0.05/3 = 0.017), we found the 
SNPs rs870142 and rsl6835979 remained association with ASD 
risk; however, the SNP rs6824295 lost its association with ASD. 
Given the high LD of the 3 SNPs, Bonferroni correction might be 
too strict. Additionally, owing to the genetic heterogeneity, the 
allele frequency of the SNP rsl 6835979 in both populations was 
dififerent (Chinese population: MAF = 0.380 in cases, 
MAF = 0.334 in controls; European population: MAF = 0.314 in 
cases, MAF = 0.235 in controls) and the effect size in the current 
study was weaker than that in European population (OR =1.22 in 



Table 2. Characteristics of CHD cases In validation stage. 



Variables 


Case (N = 1,709) 


Control (N = 1,962) 


P 


Age (mean±SD) 


3.21 ±8.70 


3.45±7.93 


0.221 


Sex 


Male 


738 (43.2%) 


842 (42.9%) 


0.870 


Female 


971 (56.8%) 


1,120 (57.1%) 




Phenotype 


ASD 


367 






VSD 


432 






TOF 


211 






Other CHDs 


699 







ASD: ostium secundum atrial septal defect; VSD: ventricular septal defect; TOF: tetralogy of fallot; Other CHDs included patent ductus arteriosus, ebstein anomaly, 
double-chambered right ventricle, bicuspid aortic valve, aortic insufficiency, aortic stenosis, mitral insufficiency, rupture of valsalva sinus aneurysm, etc. 
P-values were derived from the x2 test for the categorical variable (sex) and t-test for the continuous variable (age). 
doi:1 0.1 371 /journal.pone.Ol 0741 1 .t002 
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Chinese population vs. OR — 1 .46 in European population). Using 
Levin's formula [13], we also found that the population- 
attributable risk of ASD for the SNP rs 16835979 was relatively 
lower in Chinese (7"/()) than that in Caucasian (9%). 

All 3 SNPs locate within an intron of the noncoding RNA gene 
LOC100507266 with unknown function. Meanwhile, they also lie 
in the 320-kb interval between STX18 (syntaxin 18) and MSXl 
(msh homeobox 1) and locate upstream of both genes. The long 
non-coding RNA gene LOC100507266 is known as STX18 
antisense RNA 1 (STX18-AS1). STX18 is principally located in 
the endoplasmic reticulum, and it plays roles in transport between 
the endoplasmic reticulum and Golgi [14]; however, no data are 
available for its function in heart development. MSXl has been 
showed to be expressed in mesenchymal structures, including the 
limb buds, pharyngeal arches, neural crest, and endocardial 
cushions of the heart [1.5,16]. In CordeU's study, MSXl was 
showed to be expressed in the atrial septum during development, 
both in mouse and chick [12]. MSXl and MSX2, which belong to 
the highly conserved Nk family of homeobox genes, display 
overlapping expression patterns and redundant functions in 
multiple tissues and organs during vertebrate development 
[15,16]. The functions of Msxl and Msx2 in endocardial 
activation prior to EMT via upregulation of NFATcl expression 
have been documented [15]. In addition, Msxl has a well- 
documented role as both a downstream effector and an upstream 
regulator of Bmp signaling. It has been implicated in regulating 
epithelial-mesenchymal transition (EMT) during cardiac valve and 
septa formation [15,17]. Handl and Hand2 are also candidate 
target genes regulated by Msxl and Msx2 [18]. Since myocardi- 
um-specific Handl deficiency leads to thickened atrioventricula 
(AV) valves, which is associated with impaired valve remodeling 
[19-21], decreasing Handl expression in the Msxl/2 double 
mutant AV myocardium may be associated with defects in 
remodeling of the AV cushions into mature valve leaflets [15]. In 
the mouse, loss both of Msxl and Msx2 leads to a severe 
endocardial cu.shion and conotruncal defects [15,18]. In humans, 
mutations in MSXl cause orofacial clefting and tooth agenesis 
[22,23]. It is speculated that these 3 SNPs may affect the 
regulation of above genes and thus have associations with the 

References 

1. Bruneau BG (2008) The developmental genetics of congenital heart disease. 
Nature 451: 943-948. 

2. van der Lindc D, Konings EE, Slager MA, Witscnburg M, Helbing WA, et al. 
(2011) Birth prr\'alcncc of congenital heart disease worldwide: a systematic 
review and meta-analysis. J .Am Coll Cardiol .58: 2241-2247. 

3. Rcller MD, Strickland MJ, Riehle-Golarusso T, Mahlc WT, Corrca A (2008) 
Prevalence of congenital heart defects in metropolitan Atlanta, 1998-200,'i. 
J Pcdiatr l.xS: 807-813. 

4. Lozano R, Naghavi M, Foreman K, Lim S, Shibuya K, et al. (2013) Global and 
regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: 
a systematic analysis for the Global Burden of Disease Study 2010. Lancet 380: 
2095-2128. 

5. Hu Z, Yuan X, Rao K, Zheng Z, Hu S (2014) National trend in congenital heart 
disease mortality in China during 2003 to 2010: A population-based study. J 
Thorac Cardiovasc Surg 148: 596-602 e591. 

6. Pierpont ME, Basson CT, Benson DVV Jr, Gelb BD, Giglia T.M. et al. (2007) 
Cjcnetie basis lor congenital heart defects: current knowledge: a seientilic 
statement from the American Heart Association Congenital Cardiac Defects 
Committee, Council on Cardiovascular Disease in the Young: endorsed by the 
American Academy of Pediatrics. Circulation 115: 3015-3038. 

7. Weismann CG, Gelb BD (2007) The genetics of congenital heart disease: a 
review of recent developments. Curr Opin Cardiol 22: 200-206. 

8. Ware SM, Jefferies JL (2012) New Genetic Insights into Congenital Heart 
Disease. J Clin Exp Cardiolog S8. 

9. Wessels MW, Willcms PJ (2010) Genetic factors in non-syndromic congenital 
heart malformations. Clin Genet 78: 103—123. 

10. Shen H, Jin G (2013) Human genome epidemiology, progress and future. 
J Biomed Res 27: 167-169. 



development of ASD; however, the specific mechanisms of the 3 
SNPs affecting the risk of ASD stiU need to be further investigated. 

Like CordeU's study, we did not observe genome-wide 
significant associations of 4p 1 6 loci with overall CHD risk in our 
GWAS as well as rephcation study. The findings suggested that 
genetic etiology of CHD could have a considerable degree of 
phenotypic specificity, which was also identified by some recent 
studies [24,25]. For the various phenotypes of CHD originated 
from different part of heart, we beUeve different genetic factors are 
involved in their pathogenic process. 

Interestingly, our previous GWAS has identified 2 susceptibility 
loci (rs2474937 at lpl2 and rsl531070 at 4q31.1) for CHD in 
Han Chinese. However, these variants may contribute to the risk 
of CHD through different pathways. As described previously, the 
SNP rs2474937 may affect the expression oiTBXlS, a member of 
T-box transcription factors that have key roles in the development 
of the embr)'onic mesoderm, including of the heart and skeleton. 
Another SNP rs 153 1070 is in a LD region overlapping the 
MAML3 (mastermind-like 3) gene and mastermind (Mam) is one 
of the essential components of Notch signaling pathway. In the 
current study, we confirmed the ASD-specific association of SNP 
rs 16835979 in Chinese. This SNP may influence MSXl (msh 
homeobox 1), which belongs to the highly conserved Nh family of 
homeobox genes. MSXl has functions in endocardial activation 
prior to epithelial-mesenchymal transition (EMT) as well as Bmp 
signaling. Thus, the findings in the current study supplemented the 
existing results of our GWAS. 

In conclusion, we present e\'idence for association between the 
common SNP rsl6835979 at 4pl6 and the ASD-specific risk in 
Han Chinese, which extends CordeU's results in different 
populations and suggests that the susceptibUity loci of cUsease 
indeed have some similarities among different races and ethnic- 
ities. 
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